Abstract. Steganography is an information hiding application
which aims to hide secret data imperceptibly into a commonly
used medium. Unfortunately, the theoretical asymptotic hiding
capacity of steganographic systems is not attained by the
algorithms developed so far. In this paper, we describe a novel
coding method based on Z2Z4-linear codes that conforms to
±1-steganography, that is, secret data is embedded into a cover
message by distorting each symbol by one unit at most. This
method solves some problems encountered by the most efficient
methods known today, which are based on ternary Hamming codes.
Finally, the performance of this new technique is compared with
that of the mentioned methods and with the well-known theoretical
upper bound.

I. Introduction and preliminary results 

Steganography is a scientific discipline within the field
known as data hiding, concerned with hiding information into
a commonly used medium in such a way that no one apart from
the sender and the intended recipient can detect the presence
of embedded data. A comprehensive overview of the core
principles and the mathematical methods that can be used for
data hiding can be found in [6].

An interesting steganographic method is known as matrix
encoding, introduced by Crandall [3] and analyzed by
Bierbrauer et al. [1]. Matrix encoding requires the sender and
the recipient to agree in advance on a parity check matrix
H, and the secret message is then extracted by the recipient
as the syndrome (with respect to H) of the received cover
object. This method was made popular by Westfeld [8], who
incorporated a specific implementation using Hamming codes
in his F5 algorithm, which can embed $t$ bits of message in
$2^t - 1$ cover symbols by changing, at most, one of them.
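As an illustration of matrix encoding with binary Hamming codes, the
following Python sketch embeds $t$ bits into the least significant bits
of $2^t - 1$ cover symbols. It is not taken from [8]: the function names,
and the choice of writing column $i$ of the parity check matrix as the
binary expansion of $i$, are assumptions made only for this example.

```python
def syndrome(bits):
    """Syndrome of a bit vector when column i (1-based) of H is the binary expansion of i."""
    result = 0
    for i, bit in enumerate(bits, start=1):
        if bit:
            result ^= i
    return result


def embed(cover_bits, message, t):
    """Embed a t-bit message (given as an int) into 2**t - 1 cover bits."""
    assert len(cover_bits) == 2**t - 1 and 0 <= message < 2**t
    # The column whose index equals (syndrome XOR message) must be added;
    # index 0 means the cover already carries the message.
    flip = syndrome(cover_bits) ^ message
    stego = list(cover_bits)
    if flip:
        stego[flip - 1] ^= 1
    return stego


def extract(stego_bits):
    """The recipient reads the message as the syndrome of the stego bits."""
    return syndrome(stego_bits)
```

With `cover = [1, 0, 1, 1, 0, 0, 1]` and `t = 3`, calling `embed(cover, 5, 3)`
flips only the fourth bit, and `extract` recovers the value 5.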

There are two parameters which help to evaluate the performance
of a steganographic method over a cover message of $N$ symbols:
the average distortion $D = \frac{R_a}{N}$, where $R_a$ is the
expected number of changes over uniformly distributed messages;
and the embedding rate $E = \frac{t}{N}$, which is the number of
bits hidden per cover symbol when $t$ bits are embedded in the
cover message. In general, for the same embedding rate a method
is better when the average distortion is smaller. Following the
terminology used by Fridrich et al. [5], the pair $(D, E)$ will
be called CI-rate.

Furthermore, as Willems et al. in [9], we will also assume
that a discrete source produces a sequence $x = (x_1, \ldots, x_N)$,
where $N$ is the block length, each
$x_i \in \aleph = \{0, 1, \ldots, 2^B - 1\}$, and $B \in \{8, 12, 16\}$
depends on whether the source is a grayscale digital image, a CD
audio, etc. The message $s \in \{1, \ldots, M\}$ we want to hide into
a host sequence $x$ produces a composite sequence $y = f(x, s)$,
where $y = (y_1, \ldots, y_N)$ and each $y_i \in \aleph$. The composite
sequence $y$ is obtained by distorting $x$, and the distortion will
be assumed to be a squared-error distortion (see [9]). Under these
conditions, if information is only carried by the least significant
bit (LSB) of each $x_i$, the appropriate solution comes from
using binary Hamming codes [8], improved using product
Hamming codes [7]. For a larger magnitude of changes, but
limited to 1, that is, $y_i = x_i + c$, where $c \in \{0, +1, -1\}$, the
situation is called "±1-steganography", and the information is
carried by the two least significant bits. It is known that the
embedding becomes statistically detectable rather quickly as
the amplitude of the embedding changes increases. Therefore,
we are interested in avoiding changes of amplitude greater than
one. With this assumption, our steganographic scheme will be
compared with the upper bound from [9] for the embedding
rate in "±1-steganography", given by $H(D) + D$, where
$H(D) = -D \log_2(D) - (1 - D) \log_2(1 - D)$ is the binary entropy
function and $0 \le D \le 2/3$ is the average distortion. A main goal
in steganography is to design schemes that approach this upper
bound.
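The bound $H(D) + D$ is easy to evaluate numerically. The short Python
sketch below does so; the function names are ours and only serve this
illustration.

```python
import math


def binary_entropy(d):
    """Binary entropy H(d) in bits, with H(0) = H(1) = 0."""
    if d in (0.0, 1.0):
        return 0.0
    return -d * math.log2(d) - (1 - d) * math.log2(1 - d)


def pm1_upper_bound(d):
    """Upper bound H(d) + d on the embedding rate, valid for 0 <= d <= 2/3."""
    if not 0 <= d <= 2 / 3:
        raise ValueError("the bound is stated for 0 <= D <= 2/3")
    return binary_entropy(d) + d
```

At the right end of the interval, $D = 2/3$, the bound equals $\log_2 3$,
the number of bits carried by one ternary symbol.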

In most of the previous papers, "±1-steganography" has
involved a ternary coding problem. Willems et al. [9] proposed
a scheme based on ternary Hamming and Golay codes, which
were proved to be optimal. Fridrich and Lisonek [4] proposed
a method based on rainbow colouring graphs which, for some
values, outperformed the scheme obtained by direct sum of
ternary Hamming codes with the same average distortion.
However, both methods from [9] and [4] show a problem when
dealing with extreme grayscale values, since they suggest
making a change of magnitude greater than one in order to
avoid having to apply the change $x_i - 1$ and $x_i + 1$ to a host
symbol of value $x_i = 0$ and $x_i = 2^B - 1$, respectively.
Note that the kind of change they propose would obviously
introduce larger distortion and therefore make the embedding
more statistically detectable.

In this paper we also consider ±1-steganography. Our
new method is based on perfect Z2Z4-linear codes which,
although not linear, have a representation using
a parity check matrix that makes them as efficient as the
Hamming codes. As we will show, this new method not
only performs better than the one obtained by direct sum
of ternary Hamming codes from [9], but it also deals better
with the extreme grayscale values, because the magnitude of
embedding changes is under no circumstances greater than
one.

To make this paper self-contained, we review in Section II
a few elementary concepts on perfect Z2Z4-linear codes,
relevant for our study. The new steganographic method is
presented in Section III, whereas an improvement to better
deal with the extreme grayscale values problem is given in
Section IV. Finally, the paper is concluded in Section V.



II. Perfect Z2Z4-linear codes

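In general, any non-empty subgroup $\mathcal{C}$ of
$\mathbb{Z}_2^\alpha \times \mathbb{Z}_4^\beta$ is a Z2Z4-additive code,
where $\mathbb{Z}_2^\alpha$ denotes the set of all binary vectors of
length $\alpha$ and $\mathbb{Z}_4^\beta$ is the set of all $\beta$-tuples
in $\mathbb{Z}_4$.

Let $C = \Phi(\mathcal{C})$, where
$\Phi : \mathbb{Z}_2^\alpha \times \mathbb{Z}_4^\beta \to \mathbb{Z}_2^n$,
with $n = \alpha + 2\beta$, is given by the map

$$\Phi(u_1, \ldots, u_\alpha | v_1, \ldots, v_\beta) = (u_1, \ldots, u_\alpha | \phi(v_1), \ldots, \phi(v_\beta)),$$

where $\phi(0) = (0, 0)$, $\phi(1) = (0, 1)$, $\phi(2) = (1, 1)$, and
$\phi(3) = (1, 0)$ is the usual Gray map from $\mathbb{Z}_4$ onto
$\mathbb{Z}_2^2$.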
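The Gray map and its extension $\Phi$ can be sketched in a few lines of
Python. The names `GRAY`, `extended_gray_map` and `lee_weight` are
introduced here only for illustration.

```python
# Gray map phi from Z4 onto pairs of bits.
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}


def extended_gray_map(binary_part, quaternary_part):
    """Map (u | v) with u binary and v quaternary to its binary image."""
    image = list(binary_part)
    for v in quaternary_part:
        image.extend(GRAY[v % 4])
    return tuple(image)


def lee_weight(v):
    """Lee weight of an element of Z4: 0, 1, 2, 1 for 0, 1, 2, 3."""
    return min(v % 4, -v % 4)
```

For every pair $a, b \in \mathbb{Z}_4$, the Lee weight of $a - b$ equals the
Hamming distance between $\phi(a)$ and $\phi(b)$, which is the property
that makes the binary image useful.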

A Z2Z4-additive code $\mathcal{C}$ is also isomorphic to an abelian
structure like $\mathbb{Z}_2^\gamma \times \mathbb{Z}_4^\delta$.
Therefore, $\mathcal{C}$ has $|\mathcal{C}| = 2^\gamma 4^\delta$
codewords, where $2^{\gamma+\delta}$ of them are of order two. We call
such a code $\mathcal{C}$ a Z2Z4-additive code of type
$(\alpha, \beta; \gamma, \delta)$, and its binary image $C$ is a
Z2Z4-linear code of type $(\alpha, \beta; \gamma, \delta)$. Note that the
Lee distance of a Z2Z4-additive code $\mathcal{C}$ coincides with the
Hamming distance of the Z2Z4-linear code $C = \Phi(\mathcal{C})$, and
that the binary code $C$ does not have to be linear.

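The Z2Z4-additive dual code of $\mathcal{C}$, denoted by
$\mathcal{C}^\perp$, is defined as the set of vectors in
$\mathbb{Z}_2^\alpha \times \mathbb{Z}_4^\beta$ that are orthogonal
to every codeword in $\mathcal{C}$, where the inner product in
$\mathbb{Z}_2^\alpha \times \mathbb{Z}_4^\beta$ is defined as follows:

$$\langle u, v \rangle = 2\left(\sum_{i=1}^{\alpha} u_i v_i\right) + \sum_{j=\alpha+1}^{\alpha+\beta} u_j v_j \in \mathbb{Z}_4, \tag{1}$$

where $u, v \in \mathbb{Z}_2^\alpha \times \mathbb{Z}_4^\beta$ and
computations are made considering the zeros and ones in the $\alpha$
binary coordinates as quaternary zeros and ones, respectively.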
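A direct transcription of the inner product (1) into Python reads as
follows. The function name and the convention of passing $\alpha$ as a
separate argument are assumptions of this sketch.

```python
def inner_product(u, v, alpha):
    """Inner product (1): the first alpha coordinates are binary, the rest quaternary."""
    assert len(u) == len(v)
    binary = sum(a * b for a, b in zip(u[:alpha], v[:alpha]))
    quaternary = sum(a * b for a, b in zip(u[alpha:], v[alpha:]))
    return (2 * binary + quaternary) % 4
```

Two vectors are orthogonal when this value is 0; for instance, $(1 | 2)$
and $(1 | 1)$ with $\alpha = 1$ give $2 \cdot 1 + 2 = 0$ in $\mathbb{Z}_4$.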

The binary code $C_\perp = \Phi(\mathcal{C}^\perp)$, of length
$n = \alpha + 2\beta$, is called the Z2Z4-dual code of $C$.

A Z2Z4-additive code $\mathcal{C}$ is said to be perfect if the code
$C = \Phi(\mathcal{C})$ is a perfect Z2Z4-linear code, that is, all
vectors in $\mathbb{Z}_2^n$ are within distance one from a codeword
and the distance between two codewords is, at least, three.



For any $m \ge 2$ and each $\delta \in \{0, \ldots, \lfloor m/2 \rfloor\}$
there exists a perfect Z2Z4-linear code $C$ of binary length
$n = 2^m - 1$, such that its Z2Z4-dual code is of type
$(\alpha, \beta; \gamma, \delta)$, where $\alpha = 2^{m-\delta} - 1$,
$\beta = 2^{m-1} - 2^{m-\delta-1}$ and $\gamma = m - 2\delta$ (note
that the binary length can be computed as $n = \alpha + 2\beta$). The
above result is due to [2] and it allows us to write the parity
check matrix H of any Z2Z4-additive perfect code for a given
value of $\delta$. Matrix H can be represented taking all possible
vectors in $\mathbb{Z}_2^\gamma \times \mathbb{Z}_4^\delta$, up to sign
changes, as columns. In this representation, there are $\alpha$ columns
which correspond to the binary part of vectors in $\mathcal{C}$, and
$\beta$ columns of order four which correspond to the quaternary part.
We agree on a representation of the $\alpha$ binary coordinates as
coordinates in $\{0, 2\} \subset \mathbb{Z}_4$.
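The parameters above can be tabulated with a small Python helper. The
function name `dual_type` is ours; the formulas are the ones just stated.

```python
def dual_type(m, delta):
    """Type (alpha, beta, gamma, delta) of the Z2Z4-dual of a perfect code of length 2**m - 1."""
    if m < 2 or not 0 <= delta <= m // 2:
        raise ValueError("need m >= 2 and 0 <= delta <= floor(m / 2)")
    alpha = 2 ** (m - delta) - 1
    beta = 2 ** (m - 1) - 2 ** (m - delta - 1)
    gamma = m - 2 * delta
    return alpha, beta, gamma, delta
```

For $m = 4$ this gives $(15, 0, 4, 0)$, $(7, 4, 2, 1)$ and $(3, 6, 0, 2)$
for $\delta = 0, 1, 2$, and in every case $\alpha + 2\beta = 2^m - 1$ and
$\gamma + 2\delta = m$.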

III. Steganography based on perfect Z2Z4-linear codes

Take a perfect Z2Z4-linear code and consider its Z2Z4-dual,
which is of type $(\alpha, \beta; \gamma, \delta)$. As stated in the
previous section, this gives us a parity check matrix H which has
$\gamma$ rows of order two and $\delta$ rows of order four.

For instance, for $m = 4$ and according to [2], there are
three different Z2Z4-additive perfect codes of binary length
$n = 2^4 - 1 = 15$ which correspond to the possible values of
$\delta \in \{0, \ldots, \lfloor m/2 \rfloor\} = \{0, 1, 2\}$. For
$\delta = 0$, the corresponding Z2Z4-additive perfect code is the usual
binary Hamming code, while for $\delta = 2$ the Z2Z4-additive perfect
code has parameters $\alpha = 3$, $\beta = 6$, $\gamma = 0$, $\delta = 2$
and the following parity check matrix:

$$H = \left(\begin{array}{ccc|cccccc} 2 & 0 & 2 & 0 & 1 & 1 & 1 & 1 & 2 \\ 2 & 2 & 0 & 1 & 0 & 1 & 2 & 3 & 1 \end{array}\right) \tag{2}$$



Let $h_i$, for $i \in \{1, \ldots, \alpha + \beta\}$, denote the $i$-th
column vector of H. Note that the all twos vector $\mathbf{2}$ is always
one of the columns in H and, for the sake of simplicity, it will
be written as the column $h_1$. We group the remaining first $\alpha$
columns in H in such a way that, for any $2 \le i \le (\alpha + 1)/2$,
the column vector $h_i$ is paired up with its complementary
column vector $\bar{h}_i = h_{\frac{\alpha-1}{2}+i}$, where
$\bar{h}_i = h_i + \mathbf{2}$.

To use these perfect Z2Z4-additive codes in steganography,
take $N = 2^{m-1} = \frac{\alpha+1}{2} + \beta$ and let
$x = (x_1, \ldots, x_N)$ be an $N$-length source of grayscale symbols
such that $x_i \in \aleph = \{0, 1, \ldots, 2^B - 1\}$, where, for
instance, $B = 8$ for grayscale images. We assume that a grayscale
symbol $x_i$ is represented as a binary vector
$(v_{(B-1)i}, \ldots, v_{1i}, v_{0i})$ such that

$$x_i = \sum_{j=0}^{B/2-1} \phi^{-1}(v_{(2j+1)i}, v_{(2j)i}) \cdot 4^j, \tag{3}$$

where $\phi^{-1}(\cdot)$ is the inverse of the Gray map. We will use the
two least significant bits (LSBs), $v_{1i}, v_{0i}$, of every grayscale
symbol $x_i$ in the source, for $i > 1$, as well as the least
significant bit $v_{01}$ of symbol $x_1$ to embed the secret message.

Each symbol $x_i$ will be associated with one or more column
vectors $h_i$ in H, depending on the grayscale symbol:

1) Grayscale symbol $x_1$ is associated with column vector
$h_1$ by taking the least significant bit $v_{01}$ of $x_1$.

2) Grayscale symbol $x_i$, for $2 \le i \le (\alpha + 1)/2$, is
associated with the two column vectors $h_i$ and $\bar{h}_i$,
by taking, respectively, the two least significant bits,
$v_{1i}, v_{0i}$, of $x_i$.

3) Grayscale symbol $x_j$, for $(\alpha + 1)/2 < j \le N$, is
associated with column vector $h_{j + \frac{\alpha-1}{2}}$ by taking its
two least significant bits $v_{1j}, v_{0j}$ and interpreting them as an
integer number $\phi^{-1}(v_{1j}, v_{0j})$ in $\mathbb{Z}_4$.

In this way, the given $N$-length packet $x$ of symbols is
translated into a vector $w$ of $\alpha$ binary coordinates and $\beta$
quaternary coordinates.
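The translation of a packet $x$ into the vector $w$ can be sketched in
Python as follows. The function name is ours, and the order of the binary
coordinates (first $v_{01}$, then the bits $v_{1i}$, then the bits
$v_{0i}$ of the paired symbols) is an assumption about how the paired
columns are arranged; for $\alpha = 3$ it agrees with Example 1.

```python
# Gray map phi: quaternary digit -> (v1, v0).
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}


def symbols_to_vector(x, alpha):
    """Return (binary part, quaternary part) of w for a packet x of grayscale symbols."""
    pairs = (alpha - 1) // 2  # number of symbols x_2 .. x_{(alpha+1)/2}
    # By (3), the two lowest Gray-coded bits of a symbol depend only on x mod 4.
    v1, v0 = zip(*(GRAY[s % 4] for s in x))
    binary = [v0[0]] + list(v1[1:1 + pairs]) + list(v0[1:1 + pairs])
    quaternary = [s % 4 for s in x[1 + pairs:]]
    return binary, quaternary
```

For the packet $(239, 251, 90, 224, 226, 187, 229, 180)$ and $\alpha = 3$
this returns the binary part $(0, 1, 0)$ and the quaternary part
$(2, 0, 2, 3, 1, 0)$.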

The embedding process we are proposing is based on the
matrix encoding method [3], [8]. The secret message can be
any vector $s \in \mathbb{Z}_2^\gamma \times \mathbb{Z}_4^\delta$.
Vector $e \cdot h_i$ indicates the changes needed to embed $s$ within
$x$; that is, $Hw^T + e \cdot h_i = s$, where $e$ is an integer whose
value will be described below, $Hw^T$ is the syndrome vector of $w$
and $h_i$ is a column vector in H. The following situations can occur,
depending on which column $h_i$ needs to be modified:

1) If $h_i = h_1$, then the embedder is required to change the
least significant bit of $x_1$ by adding or subtracting one
unit to/from $x_1$, depending on which operation will flip
its least significant bit, $v_{01}$.

2) If $h_i$ is among the first $\alpha$ column vectors in H and
$2 \le i \le \alpha$, then $e$ can only be $e = 1$. In this case, since
$h_i$ was paired up with its complementary column vector $\bar{h}_i$,
this situation is equivalent to making $(v_{1i}, 1 + v_{0i})$ or
$(1 + v_{1i}, v_{0i})$, where $v_{1i}$ and $v_{0i}$ are the least
significant bits of the symbol $x_i$ which had been associated with
those two column vectors. Hence, after the inverse of the
Gray map, by changing one or the other least significant
bit we are actually adding or subtracting one unit to/from
$x_i$. Note that a problem may crop up at this point when
we need to add 1 to a symbol $x_i$ of value $2^B - 1$ or,
likewise, when $x_i$ has value 0 and we need to subtract
1 from it.

3) If $h_i$ is one of the last $\beta$ columns in H, this situation
corresponds to adding $e \in \{0, 1, 2, 3\}$ to
$x_{i-(\alpha-1)/2}$. Note that because we are using a Z2Z4-additive
perfect code, $e$ will never be 2. Hence, the
embedder should add ($e = 1$) or subtract ($e = 3$) one
unit to/from symbol $x_{i-(\alpha-1)/2}$. Once again, a problem
may arise with the extreme grayscale values.

Example 1: Let $x = (239, 251, 90, 224, 226, 187, 229, 180)$
be an $N$-length source of grayscale symbols, where
$x_i \in \{0, \ldots, 255\}$ and $N = 8$, and let H be the matrix in (2).
The source $x$ is then translated into the vector $w = (010|202310)$
in the way specified above. Let $s = (0, 2)^T$ be the vector
representing the secret message we want to embed in $x$. We
then compute $Hw^T = (2, 3)^T$ and see, by the matrix encoding
method, that $e = 3$ and $h_i = h_9$, since
$s - Hw^T = (2, 3)^T = 3 h_9$. According to the method just
described, we should apply the change $x_8 - 1$. In this way, $x_8$
becomes $x_8 = 179$, and then $w = (010|202313)$, which has
the expected syndrome $(0, 2)^T$.
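Example 1 can be reproduced with the following Python sketch, written for
the $m = 4$, $\delta = 2$ code only. The column list `H` is the matrix (2)
read column by column, all function names are ours, and the extreme
grayscale values are deliberately not handled here, as in this section.

```python
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}  # digit -> (v1, v0)

# Columns h1..h9 of H in (2); the first ALPHA columns are the binary ones.
H = [(2, 2), (0, 2), (2, 0),
     (0, 1), (1, 0), (1, 1), (1, 2), (1, 3), (2, 1)]
ALPHA = 3
PAIRS = (ALPHA - 1) // 2


def to_w(x):
    """Translate the packet x into the vector w (binary part first)."""
    v1, v0 = zip(*(GRAY[s % 4] for s in x))
    binary = [v0[0]] + list(v1[1:1 + PAIRS]) + list(v0[1:1 + PAIRS])
    return binary + [s % 4 for s in x[1 + PAIRS:]]


def syndrome(w):
    """Compute H w^T over Z4."""
    return tuple(sum(c * h[r] for c, h in zip(w, H)) % 4 for r in range(2))


def flip_bit(sym, bit):
    """Return sym + 1 or sym - 1, whichever flips Gray bit `bit` (0 is v1, 1 is v0)."""
    for step in (1, -1):
        if GRAY[(sym + step) % 4][bit] != GRAY[sym % 4][bit]:
            return sym + step


def embed(x, s):
    """Embed s in x by changing at most one symbol by one unit (no saturation handling)."""
    diff = tuple((a - b) % 4 for a, b in zip(s, syndrome(to_w(x))))
    y = list(x)
    if diff == (0, 0):
        return y
    for i, h in enumerate(H):
        for e in ((1,) if i < ALPHA else (1, 3)):
            if tuple(e * c % 4 for c in h) == diff:
                if i == 0:                       # case 1: flip v0 of x_1
                    y[0] = flip_bit(y[0], 1)
                elif i <= PAIRS:                 # case 2: column h_i, flip v1
                    y[i] = flip_bit(y[i], 0)
                elif i < ALPHA:                  # case 2: complementary column, flip v0
                    y[i - PAIRS] = flip_bit(y[i - PAIRS], 1)
                else:                            # case 3: quaternary column
                    y[i - PAIRS] += 1 if e == 1 else -1
                return y
    raise ValueError("no single change found")
```

Calling `embed([239, 251, 90, 224, 226, 187, 229, 180], (0, 2))` changes
only the last symbol, from 180 to 179, and the syndrome of the resulting
packet is $(0, 2)$.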

As already mentioned at the beginning of this paper, the
problematic cases related to the extreme grayscale values are
also present in the methods from [4] and [9], but their authors
assume that the probability of gray value saturation is not too
large. We argue that, though rare, this gray saturation can still
occur. However, in order to compare our proposal with these
others, we will not consider these problems either until the next
section. Therefore, we proceed to compute the values of the
average distortion $D$ and the embedding rate $E$.

Our method is able to hide any secret vector
$s \in \mathbb{Z}_2^\gamma \times \mathbb{Z}_4^\delta$ into the given
$N$ symbols. Hence, the embedding rate is $(\gamma + 2\delta)$
bits per $N$ symbols, $E = \frac{\gamma + 2\delta}{N} = \frac{m}{2^{m-1}}$.

Concerning the average distortion $D$, we are using a perfect
code of binary length $2^m - 1$, which corresponds to
$N = 2^{m-1}$ grayscale symbols. There are $N - 1$ symbols $x_i$, for
$2 \le i \le N$, with a probability $2/2^m$ of being subjected to
a change; a symbol $x_1$ with a probability $1/2^m$ of being the
one changed; and, finally, there is a probability of $1/2^m$ that
none of the symbols will need to be changed to embed $s$.
Hence, $R_a = (N - 1)\frac{2}{2^m} + \frac{1}{2^m}$ and
$D = \frac{2N - 1}{N 2^m} = \frac{2^m - 1}{2^{2m-1}}$.

The described method has CI-rate
$(D_m, E_m) = \left(\frac{2N-1}{2N^2}, \frac{1 + \log_2(N)}{N}\right)$,
where $N = 2^{m-1}$ and $m$ is any integer $m \ge 2$. We are able to
generate a specific embedding scheme for any value of $m$ but not for
any CI-rate.
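The counting argument for $D$ and $E$ can be checked with exact
arithmetic. The Python sketch below follows the probabilities stated in
the text; the function name `ci_rate` is ours.

```python
from fractions import Fraction


def ci_rate(m):
    """Return (D_m, E_m) for the perfect code of binary length 2**m - 1."""
    n_symbols = 2 ** (m - 1)
    # N - 1 symbols change with probability 2/2^m each, and x_1 with probability 1/2^m.
    expected_changes = (n_symbols - 1) * Fraction(2, 2 ** m) + Fraction(1, 2 ** m)
    distortion = expected_changes / n_symbols
    rate = Fraction(m, n_symbols)
    return distortion, rate
```

For $m = 4$ this gives $D = 15/128$ and $E = 1/2$, in agreement with
$(2N - 1)/(2N^2)$ and $(1 + \log_2 N)/N$ for $N = 8$.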

With the aim of improving this situation, convex combinations
of CI-rates of two codes related to their direct sum are
extensively treated in [4]. Actually, it is possible to choose
the $D$ coordinate and cover more CI-rates by taking convex
combinations. Therefore, if $D$ is a non-allowable parameter
for the average distortion we can still take $D_1 < D < D_2$,
where $D_1, D_2$ are two contiguous allowable parameters, and
by means of the direct sum of the two codes with embedding
rates $E_1$ and $E_2$, respectively, we can obtain a new CI-rate
$(D, E)$, with $D = \lambda D_1 + (1 - \lambda) D_2$ and
$E = \lambda E_1 + (1 - \lambda) E_2$.
From a graphic point of view, this is equivalent to drawing a line
between two contiguous points $(D_1, E_1)$ and $(D_2, E_2)$, as
shown in Fig. 1.

In the following theorem we claim that the CI-rate of
our method improves the one given by direct sum of ternary
Hamming codes from [9].

Theorem 1: For $m \ge 4$, the CI-rate given by the method
based on Z2Z4-additive perfect codes improves the CI-rate
obtained by direct sum of ternary Hamming codes with the
same average distortion.

Proof: Optimal embedding (of course, in the allowable
values of $D$) can be obtained by using ternary codes, as
shown in [9]. The CI-rate of these codes is
$(D_\mu, E_\mu) = \left(\frac{2}{3^\mu}, \frac{2\mu \log_2(3)}{3^\mu - 1}\right)$
for any integer $\mu$. Our method, based on Z2Z4-additive perfect
codes, has CI-rate
$(D_m, E_m) = \left(\frac{2N-1}{2N^2}, \frac{1 + \log_2(N)}{N}\right)$
for any integer $m \ge 2$ and $N = 2^{m-1}$.

Take, for any $m \ge 2$, two contiguous values for $\mu$ such that
$D_{\mu+1} < D_m < D_\mu$ and write
$D_m = \lambda D_{\mu+1} + (1 - \lambda) D_\mu$,
where $0 \le \lambda \le 1$.

We want to prove that, for $m \ge 4$, we have
$E_m > \lambda E_{\mu+1} + (1 - \lambda) E_\mu$, which is
straightforward. However, since the computation is neither short nor
helpful for understanding the method, we do not include it here. The
graphic in Fig. 1 compares the CI-rate of the method based on ternary
Hamming codes with that of the method based on Z2Z4-additive perfect
codes. As one may see in this graphic, for some values of
the average distortion $D$, the scheme based on Z2Z4-additive
perfect codes has greater embedding rate $E$ than the one based
on ternary Hamming codes. ■
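Since the computation is omitted, a numerical spot check of the
inequality may be useful. The Python sketch below evaluates both sides
for a few values of $m$ using the CI-rates stated in the proof; it is not
a proof, and the function names are ours.

```python
import math


def z2z4_rate(m):
    """CI-rate (D_m, E_m) of the method based on Z2Z4-additive perfect codes."""
    n = 2 ** (m - 1)
    return (2 * n - 1) / (2 * n * n), m / n


def ternary_rate(mu):
    """CI-rate (D_mu, E_mu) of the ternary Hamming code with mu parity checks."""
    return 2 / 3 ** mu, 2 * mu * math.log2(3) / (3 ** mu - 1)


def ternary_direct_sum_rate(d):
    """Rate on the segment joining the two contiguous ternary points around d (d <= 2/3)."""
    mu = 1
    while ternary_rate(mu + 1)[0] > d:
        mu += 1
    (d_hi, e_hi), (d_lo, e_lo) = ternary_rate(mu), ternary_rate(mu + 1)
    lam = (d_hi - d) / (d_hi - d_lo)  # weight lambda of the point mu + 1
    return lam * e_lo + (1 - lam) * e_hi
```

For $m = 4$, for example, $E_m = 0.5$ while the direct sum of ternary
Hamming codes reaches roughly $0.49$ at the same distortion $D_m = 15/128$;
for $m = 3$ the inequality is reversed, which is consistent with the
hypothesis $m \ge 4$.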

Remark: The same argument can be used and the same
conclusion can be reached taking $q$ instead of 3 and comparing
our method with the method described in [4].



Fig. 1. CI-rate (D, E), for B = 8, of steganographic methods based on
ternary Hamming codes and on Z2Z4-additive perfect codes.

IV. Solving the extreme grayscale values problem

In Section III we described a problem which may arise
when, according to our method, the embedder is required to
add one unit to a source symbol $x_i$ containing the maximum
allowed value ($2^B - 1$), or to subtract one unit from a symbol
$x_i$ containing the minimum allowed value, 0. To face this
problem, we will use the complementary column vector $\bar{h}_i$
of columns $h_i$ in matrix H, where $\bar{h}_i = 3h_i + \mathbf{2}$ and
$h_i$ is among the last $\beta$ columns in H. Note that $h_i$ and
$\bar{h}_i$ can coincide.

The first $\alpha$ column vectors in H will be paired up as before,
and the association between each $x_i$ and each column vector
$h_i$ in H will also be the same as in Section III. However, given
an $N$-length source of grayscale symbols $x = (x_1, \ldots, x_N)$,
a secret message $s \in \mathbb{Z}_2^\gamma \times \mathbb{Z}_4^\delta$
and the vector $e \cdot h_i$, such that $Hw^T + e \cdot h_i = s$,
indicating the changes needed to embed $s$ within $x$, we can now make
some variations on the kinds of changes to be done for the specific
problematic cases:

• If $h_i$ is among the first $\alpha$ columns in H, for
$2 \le i \le \alpha$, and the embedder is required to add 1 to a symbol
$x_i = 2^B - 1$, then the embedder should instead subtract
1 from $x_i$ as well as perform the appropriate operation
(+1 or −1) over $x_1$ to have $v_{01}$ flipped. Likewise, if
the embedder is required to subtract 1 from a symbol
$x_i = 0$, then (s)he should instead add 1 to $x_i$ and also
change $x_1$ to flip $v_{01}$.

• If $h_i$ is one of the last $\beta$ columns in H, and the embedder
has to add 1 to a symbol $x_i = 2^B - 1$, (s)he should instead
subtract 1 from the grayscale symbol associated with $\bar{h}_i$
and also change $x_1$ to flip $v_{01}$. If the method requires
subtracting 1 from $x_i = 0$, then we should instead add
1 to the symbol associated with $\bar{h}_i$ and, again, change $x_1$
to flip $v_{01}$.



Example 2: Let $x = (239, 251, 90, 224, 226, 187, 229, 0)$
be an $N$-length source of grayscale symbols, where
$x_i \in \{0, \ldots, 255\}$ and $N = 8$, and let H be the matrix in (2).
As in Example 1, the packet $x$ is translated into vector
$w = (010|202310)$, and $s = (0, 2)^T$. However, note that now
we are not able to make $x_8 - 1$ because $x_8 = 0$. Instead
of this, we will add one unit to $x_3$, which is the symbol
associated with $\bar{h}_9 = h_4$, and subtract one unit from $x_1$
so as to have its least significant bit flipped. Therefore,
we obtain $x = (238, 251, 91, 224, 226, 187, 229, 0)$ and then
$w = (110|302310)$, which has the desired syndrome.
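The fallback used in Example 2 can be sketched in Python for the
quaternary columns of the matrix (2). This is an illustration under
simplifying assumptions: the names are ours, the binary-column cases are
omitted, and the situation in which the symbol of the complementary
column is itself saturated is not handled.

```python
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}  # digit -> (v1, v0)
H = [(2, 2), (0, 2), (2, 0),
     (0, 1), (1, 0), (1, 1), (1, 2), (1, 3), (2, 1)]  # columns h1..h9 of (2)
ALPHA, B = 3, 8
PAIRS = (ALPHA - 1) // 2
MAX = 2 ** B - 1


def to_w(x):
    v1, v0 = zip(*(GRAY[s % 4] for s in x))
    binary = [v0[0]] + list(v1[1:1 + PAIRS]) + list(v0[1:1 + PAIRS])
    return binary + [s % 4 for s in x[1 + PAIRS:]]


def syndrome(w):
    return tuple(sum(c * h[r] for c, h in zip(w, H)) % 4 for r in range(2))


def flip_v0(sym):
    """Return sym + 1 or sym - 1, whichever flips the least significant Gray bit."""
    for step in (1, -1):
        if GRAY[(sym + step) % 4][1] != GRAY[sym % 4][1]:
            return sym + step


def embed(x, s):
    """Embed s using quaternary columns only, never leaving the range [0, MAX]."""
    diff = tuple((a - b) % 4 for a, b in zip(s, syndrome(to_w(x))))
    y = list(x)
    if diff == (0, 0):
        return y
    for i in range(ALPHA, len(H)):
        for e, step in ((1, 1), (3, -1)):
            if tuple(e * c % 4 for c in H[i]) == diff:
                j = i - PAIRS
                if 0 <= y[j] + step <= MAX:
                    y[j] += step
                else:
                    # Saturated: opposite change on the symbol of the
                    # complementary column 3*h_i + 2, plus a flip of v0 in x_1.
                    comp = tuple((3 * c + 2) % 4 for c in H[i])
                    y[H.index(comp) - PAIRS] -= step
                    y[0] = flip_v0(y[0])
                return y
    raise NotImplementedError("binary-column cases are omitted in this sketch")
```

For the packet of Example 2 and $s = (0, 2)$, `embed` returns
$(238, 251, 91, 224, 226, 187, 229, 0)$: two symbols change by one unit
each and no symbol leaves the range $\{0, \ldots, 255\}$.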

The method described above has the same embedding rate
$E = \frac{m}{2^{m-1}}$ as the one from Section III, but a slightly
worse average distortion. We will take into account the squared-error
distortion defined in [9] for our reasoning.

As before, among the total number of grayscale symbols
$N = 2^{m-1}$, there are $N - 1$ symbols $x_i$, for $2 \le i \le N$, with
a probability $2/2^m$ of being changed; a symbol $x_1$ with a
probability $1/2^m$ of being the one changed; and, finally, there
is a probability of $1/2^m$ that none of the symbols will need
to be changed.

As one may have noted in this scheme, performing a certain
change to a symbol $x_i$, associated with a column $h_i$ in H,
has the same effect as performing the opposite change to the
grayscale symbol associated with $\bar{h}_i$ and also changing the
least significant bit $v_{01}$ of $x_1$. This means that with probability
$\frac{2^B - 2}{2^B}$ we will change a symbol $x_i$, for $2 \le i \le N$, a
magnitude of 1; and with probability $\frac{2}{2^B}$ we will change
two other symbols also a magnitude of 1. Therefore,

$$R_a = (N - 1)\frac{2}{2^m}\left(\frac{2^B - 2}{2^B} + 2 \cdot \frac{2}{2^B}\right) + \frac{1}{2^m}$$

and the average distortion is thus

$$D = \frac{2N - 1 + \frac{N-1}{2^{B-2}}}{N 2^m}.$$

Hence, the described method has CI-rate

$$(D_m, E_m) = \left(\frac{2N - 1 + \frac{N-1}{2^{B-2}}}{2N^2}, \frac{1 + \log_2(N)}{N}\right).$$
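The closed form for $D$ can be checked against the expression for $R_a$
with exact arithmetic. In the Python sketch below the function name is
ours, and the probabilities are the ones used in the text.

```python
from fractions import Fraction


def saturated_ci_rate(m, b):
    """CI-rate of the method of Section IV for a perfect code of length 2**m - 1 and B = b."""
    n = 2 ** (m - 1)
    p_change = Fraction(2, 2 ** m)  # probability that a given x_i, i >= 2, is changed
    p_extreme = Fraction(2, 2 ** b)  # probability assigned to the problematic case
    # One change of magnitude 1 in the regular case, two such changes otherwise.
    r_a = (n - 1) * p_change * ((1 - p_extreme) * 1 + p_extreme * 2) + Fraction(1, 2 ** m)
    return r_a / n, Fraction(m, n)
```

For $m = 4$ and $B = 8$ the distortion is $967/8192 \approx 0.1180$,
only slightly above the value $15/128 \approx 0.1172$ obtained in
Section III.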




As we have already mentioned, the problem of grayscale
symbols with values 0 and $2^B - 1$ was previously detected in
both [4] and [9]. With the aim of providing a possible solution
to this problem, the authors suggested performing a change of
a magnitude greater than 1. However, the effects of doing this
are out of the scope of ±1-steganography.

In the remainder of this section we proceed to compare the
CI-rate of our method with the CI-rate that those methods
would have if their proposed solution was implemented.

The scheme presented by Willems et al. [9] is based on
ternary Hamming codes, which are known to have length
$n = (3^\mu - 1)/2$, where $\mu$ denotes the number of parity check
equations. Let us assume that whenever the embedder is required
to perform a change (+1 or −1) that would lead the corresponding
symbol $x_i$ to a non-allowed value, then a change of
magnitude 2 (−2 or +2) is made instead. While the embedding
rate $E$ of this scheme would still be
$E = \frac{2\mu \log_2(3)}{3^\mu - 1}$, the average distortion $D$ would
no longer be $D = \frac{2}{3^\mu}$. The
actual expected number of changes $R_a$ is computed by noting
that a symbol will be changed with probability
$\frac{3^\mu - 1}{3^\mu}$, and will not with probability
$\frac{1}{3^\mu}$. Among the cases in which a
symbol would need to be changed, there is a probability of
$\frac{2^B - 2}{2^B}$ that a symbol will be changed a magnitude of 1,
and a probability of $\frac{2}{2^B}$ that it will be changed a magnitude
of 2. By the squared-error distortion,

$$R_a = \frac{3^\mu - 1}{3^\mu}\left(\frac{2^B - 2}{2^B} + 2^2 \cdot \frac{2}{2^B}\right)$$

and therefore

$$D = \frac{2}{3^\mu}\left(1 + \frac{3}{2^{B-1}}\right).$$



Fridrich and Lisonek propose in their paper to pool the
grayscale symbols of the source $x$ into cells of size $d$, then
rainbow colour these cells and apply a $q$-ary Hamming code, where
$q = 2d + 1$ is a prime power. They measure the distortion by
counting the maximum number of embedding changes, thus
just considering the covering radius of the $q$-ary Hamming
codes. However, we will now consider the average number of
embedding changes (see [5]). As Willems et al., the authors
from [4] also suggest performing a change of magnitude
$q - 1 > 1$ to solve the extreme grayscale values problem.
If this is done, the embedding rate would still be the same,
$E = \frac{2\mu \log_2(q)}{q^\mu - 1}$, but the average distortion would
now be

$$D = \frac{2}{q^\mu}\left(\frac{2^B - 2}{2^B} + (q - 1)^2 \cdot \frac{2}{2^B}\right) = \frac{2}{q^\mu}\left(1 + \frac{q(q-2)}{2^{B-1}}\right).$$
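The simplification of this distortion can be verified with exact
arithmetic. The Python sketch below assumes, consistently with the
embedding rate above, that the cover consists of $(q^\mu - 1)/(q - 1)$
cells of $d = (q - 1)/2$ symbols each, that is, $(q^\mu - 1)/2$ symbols;
the function name is ours.

```python
from fractions import Fraction


def qary_saturated_distortion(q, mu, b):
    """Average squared-error distortion when extreme values force a change of magnitude q - 1."""
    n_symbols = Fraction(q ** mu - 1, 2)  # (q^mu - 1)/(q - 1) cells of (q - 1)/2 symbols
    p_change = Fraction(q ** mu - 1, q ** mu)  # probability that some symbol is changed
    p_extreme = Fraction(2, 2 ** b)  # probability assigned to the problematic case
    r_a = p_change * ((1 - p_extreme) * 1 + p_extreme * (q - 1) ** 2)
    return r_a / n_symbols
```

For $q = 3$ this reduces to $\frac{2}{3^\mu}\left(1 + \frac{3}{2^{B-1}}\right)$,
the value obtained above for ternary Hamming codes.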



One can see in Fig. 2 how our steganographic method for
Z2Z4-additive perfect codes deals with the extreme grayscale
values problem, for some values of $D$, better than those using
ternary Hamming codes ($q = 3$) from [4] and [9].

V. Conclusions 

In this paper, we have presented a new method for
±1-steganography, based on perfect Z2Z4-linear codes. These
codes are non-linear but still there exists a parity check matrix
representation that makes them efficient to work with.

As we have shown in Sections III and IV, this new scheme
outperforms the one obtained by direct sum of ternary Hamming
codes (see [9]) as well as the one obtained after rainbow
colouring graphs by using $q$-ary Hamming codes for $q = 3$.

Fig. 2. CI-rates (D, E), for B = 8, of steganographic methods based on
ternary Hamming codes and on Z2Z4-additive perfect codes, when they are
dealing with the extreme grayscale values problem described in Section IV.



If we consider the special cases in which the technique
might require subtracting one unit from a grayscale symbol
containing the minimum allowed value, or adding one unit to
a symbol containing the maximum allowed value, our method
performs even better than the aforementioned schemes. This
is so because, unlike them, our method never applies any
change of magnitude greater than 1, but two changes of
magnitude 1 instead, which is better in terms of distortion.
Therefore, our method makes the embedding less statistically
detectable.

As for further research, since the approach based on product
Hamming codes in [7] improved the performance of basic LSB
steganography and the basic F5 algorithm, we would also
expect a considerable improvement of the CI-rate by using
product Z2Z4-additive codes or subspaces of product
Z2Z4-additive codes in ±1-steganography.